Abstract. We review how the Square Kilometre Array (SKA) will address fundamental questions in cosmology, 
focussing on its use for neutral Hydrogen (HI) surveys. A key enabler of its unique capabilities will be large (but 
smart) receptors in the form of aperture arrays. We outline the likely contributions of Phase-1 of the SKA (SKA1), 
Phase-2 SKA (SKA2) and pathfinding activities (SKA0). We emphasise the important role of cross-correlation 
between SKA HI results and those at other wavebands such as: surveys for objects in the EoR with VISTA and 
the SKA itself; and huge optical and near-infrared redshift surveys, such as those with HETDEX and Euclid. We 
note that the SKA will contribute in other ways to cosmology, e.g. through gravitational lensing and H_0 studies. 



1. Introduction 

Over the last two decades, observations of the Cosmic 
Microwave Background (CMB) have revolutionised our 
understanding of the Universe. They have confirmed 
theoretical predictions that the seeds of all structure are 
present ~ 0.4 Myr (z ~ 1100) after the Big Bang, in the 
form of tiny (~ 1 part in 10^5) fluctuations in an otherwise 
smooth, featureless Universe. At this stage, gravity has 
had only a limited time to amplify these fluctuations, so 
their power spectrum P(k), imprinted as an angular 
pattern in the CMB, has proven rich in cosmological 
information. This includes an oscillatory (with scale) signature 
due to 'Baryon Acoustic Oscillations' (BAOs; footnote 1), and 
other features, like the ratio of power on large and small 
physical scales, that inform on neutrino mass (footnote 2). 

Here we argue that the Square Kilometre Array (SKA, 
Rawlings & Schilizzi 2011) will be a key facility in 
answering the questions of how the Universe formed its structure 
- galaxies, black holes, and stars. It will allow astronomers 
access to epochs between z ~ 20 - 30 and z ~ 6 when 
energy sources from the forming structure drove 
fundamental changes in the neutral Hydrogen (HI), ending in 
an Epoch of Reionization (EoR) around z ~ 10 when the 
HI between galaxies became reionized (e.g. Zaroubi 2011). 

We will also argue that the SKA is poised to become 
the premier tool for probing the large-scale structure of 



Footnote 1: BAOs are frozen-in plasma oscillations, or sound waves, 
from the pre-recombination Universe that, observed through 
cosmic time, act as a cosmic 'standard ruler'. 

Footnote 2: Higher neutrino mass, and hence higher energy density in 
the known cosmic number density of neutrinos, means less 
clustering, and hence less power, on small scales, because neutrinos, 
a form of hot dark matter, 'free-stream' out of Cold-Dark-Matter 
(CDM) condensations. 



the Universe after the EoR, using features in P(k) and 
other statistical measures, to understand the properties of 
dark energy or post-Einstein gravity, to measure neutrino 
masses, and to study the causes of cosmic inflation. 



2. Surveying the Universe 

Table 1 gives the number of independent (Fourier) 
modes of the Universal density field around a comoving 
wavenumber k = 2π/x = 0.125 Mpc^-1 corresponding to 
a length scale x = 50 Mpc. At z = 0, this scale lies at the 
edge of the 'linear regime' where the matter over-density 
δ ≲ 1, but, at higher z, it lies comfortably within the 
linear regime and is smaller than the first two 'wiggles' in the 
critical BAO signature. Table 1 considers various sky 
areas and redshift depths relevant to existing, or upcoming, 
cosmological surveys; similar types of data are presented 
graphically in Loeb & Wyithe (2008). 

The surveys given as examples in Table 1 probe the 
density field in different ways. Optical and near-IR 
redshift surveys (SDSS, BOSS (footnote 3), BigBOSS (footnote 4), 
Euclid [Laureijs et al.], HETDEX [Hill et al.]), and 
'thresholded' (footnote 5) SKA surveys detect individual 
galaxies that are taken to Poisson-sample the underlying field. 
For these surveys, nP(k) ≳ 1 (where n is the comoving number 
density of survey galaxies) is the condition for shot noise to 
be sub-dominant, and regions where nP(k) ≫ 1 allow 
checks for systematic errors by comparing 
results from different sub-sets of the data. 



Footnote 3: http://www.sdss3.org/surveys/boss.php/ 

Footnote 4: http://bigboss.lbl.gov/ 

Footnote 5: Catalogued surveys formed from detecting 'islands' of HI 
emission at nσ, with n ~ 5 - 10, above an experimental 
background with r.m.s. σ. 
Table 1. Measures of the density field in the Universe possible with various upcoming surveys (those with power- 
spectrum accuracies marked in bold will allow transformational progress). Column 1: redshift range of spectroscopic 
survey. Column 2: sky area of survey (in sr). Column 3: number of independent Fourier modes N_modes in the survey 
(covering cosmic comoving volume V) over a factor of e spread in k centred on k = 0.125 Mpc^-1 (equivalent to 50 
Mpc in comoving units); this is calculated using N_modes = [V/(2π)^3] × 2πk^3. Column 4: Fractional error in measured 
power spectrum P(k) = <|δ_k|^2> (where δ_k are the Fourier coefficients of the transform of the real-space over-density 
δ = δρ/ρ), given by 1/√N_modes, assuming the survey is limited by cosmic variance (rather than shot or experimental 
noise) - this column is left blank whenever the example surveys are not in this regime (or are essentially 2D surveys, 
as is the case for the CMB); in practice, P(k) will be measured in several, say 4, redshift bins, with correspondingly 
lower accuracy, say factor ~ 2, in each bin. Column 5: survey mentioned in the text. Although there are other planned 
ground-based optical surveys, we consider HETDEX, BOSS and BigBOSS as examples; for space-based near-IR redshift 
survey data, we consider Euclid. 

Table note: Cosmic-variance-limited mapping will become possible over a limited redshift range provided the SKA Advanced 
Instrumentation Program (see Rawlings & Schilizzi 2011) generates the capability for early deployment of a small 
fraction of the SKA2 mid-frequency AAs in a new core embedded within SKA1 (see Section 3). 

Table note: A large amount of mode averaging is needed to obtain a statistical detection of the structure encoded in P(k). 

Table note: Assuming ~ 10 per cent of the sky is unusable because of residual Galactic Plane contamination. 
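
The mode-counting relations quoted in the caption of Table 1 can be sketched numerically. The Python snippet below is only an illustration, not the calculation used to build the table: the function names are ours, and the survey volume of 1 Gpc^3 is a hypothetical round number chosen for the example.

```python
import math

def wavenumber(length_mpc):
    """Comoving wavenumber k = 2*pi/x (Mpc^-1) for a length scale x (Mpc)."""
    return 2.0 * math.pi / length_mpc

def n_modes(volume_mpc3, k):
    """Independent Fourier modes over a factor-of-e spread in k:
    N_modes = [V / (2*pi)^3] * 2*pi*k^3."""
    return volume_mpc3 / (2.0 * math.pi) ** 3 * 2.0 * math.pi * k ** 3

def fractional_error(n):
    """Cosmic-variance-limited fractional error on P(k): 1/sqrt(N_modes)."""
    return 1.0 / math.sqrt(n)

k = wavenumber(50.0)      # the 50-Mpc scale used in Table 1
n = n_modes(1.0e9, k)     # hypothetical survey volume of 1 Gpc^3
print(f"k = {k:.3f} Mpc^-1, N_modes = {n:.3g}, error = {fractional_error(n):.2%}")
```

Splitting the same volume into four redshift bins divides N_modes by four in each bin, which is the origin of the factor ~ 2 loss of accuracy per bin mentioned in the caption.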



CMB and 'non-thresholded' SKA surveys integrate all 
contributions to temperature (surface brightness) fluctuations 
δT/T in the sky (footnote 6). In the case of the SKA surveys this 
works by adding up all the contributions from HI emission 
inside and between galaxies, regardless of any threshold- 
ing criteria: after reionization, this probably still amounts 
to counting galaxies, as neutral HI can only exist in the 
dense self-shielding environments in or near galaxies. 

Thus, analysis of non-thresholded SKA data, or 'HI 
intensity mapping' (Chang et al. 2008; Chang et al. 2010), 
has the potential to utilize many more modes of the 
density field than a thresholded survey: much larger signals 
in the nP(k) < 1 regime become available in the form 
of summed HI emission modulated by the cosmic web of 
large-scale structure. The fluctuations around the abso- 



Footnote 6: The contributions to δT/T in the CMB and (SKA) HI 
signals are complicated: in the case of CMB because there are 
so many primary and secondary causes of CMB fluctuations; in 
the case of HI because there are many astrophysical processes 
capable of generating HI anisotropies. Both CMB and (SKA) 
HI experiments will have to deal with the subtraction of bright, 
complex and polarized foreground emissions: the successes of 
CMB astronomy in overcoming such difficulties provide both 
hope and lessons for HI studies with the SKA. 



lute brightness temperature level Tb of the HI sky is given 
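by (Furlanetto et al. 2007) 

\[ \delta T_b \approx 9\,\delta\,x_{\rm HI}\,(1+z)^{1/2} \left[\frac{T_S - T_{\rm CMB}}{T_S}\right] \left[\frac{H(z)/(1+z)}{dv_\parallel/dr_\parallel}\right] \ {\rm mK}, \tag{1} \]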

where x_HI is the fraction of Hydrogen that is neutral, and 
the fraction involving spin temperature T_S and CMB 
temperature T_CMB saturates (at unit value) once the IGM 
has been heated by astrophysical sources; the last 
fraction, involving the Hubble parameter H(z) and the radial 
gradient in radial velocity, accounts for the more subtle 
effects of redshift-space distortions. After reionization, the 
power spectrum of HI fluctuations traces that 
of the underlying fractional matter over-density δ. 
During reionization, on small (arcmin, 
or comoving-Mpc) scales, the signal level is boosted by 
the formation and percolation of bubbles of ionized 
material around photon-producing sources and other effects 
(Furlanetto et al. 2007). After reionization, measurements 
of the HI associated with the damped Lyα absorption lines 
of quasars suggest x_HI ~ 0.02 with no strong dependence 
on redshift (Trenti & Stiavelli). 

Roughly speaking, once the EoR ends at z ~ 6, the 
neutral fraction x_HI ~ 0.02, so Equation 1 shows that the 
Fig. 1. Estimated temperature of HI fluctuations as a function of angular scale reproduced from Furlanetto & Briggs 
(2004). The gently curved red lines represent 3σ intrinsic HI fluctuations in δν = 0.5 MHz bins assuming the IGM is 
neutral, i.e. x_HI = 1 [clearly a bad assumption at z = 6.1 where a value x_HI ~ 0.02 would scale the red line down 
by a factor ~ 50]. The solid lines show 1σ sensitivities predicted in 100- and 1000-hour exposures that lie very close 
to predictions for SKA1 (the dotted lines should be ignored). Real data from the GMRT (Paciga et al. 2010) have 
produced upper limits at the 100 mK level. 



modulation (e.g. of density fluctuations of amplitude δ ~ 1 
around just-collapsing structures) produces temperature 
fluctuations with an amplitude ~ 500[(1 + z)/7]^(1/2) μK. 
This is in good agreement with values inferred from the 
cross-correlation of HI and optical data by Chang et al. 
(2010) at redshift z ~ 0.8. During reionization, detailed 
models (e.g. Furlanetto et al. 2007), constrained by CMB 
(WMAP) measurements of the column of ionized material 
towards the CMB, suggest amplitudes ~ 10 mK. 
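
The post-reionization amplitude quoted above follows from Equation 1 by direct substitution. The short Python sketch below assumes the equation has the form 9 δ x_HI (1+z)^(1/2) mK multiplied by the spin-temperature and velocity-gradient fractions; the function name and the choice to set the velocity-gradient fraction to unity are simplifications of ours, not part of the original analysis.

```python
import math

def delta_tb_mk(delta, x_hi, z, t_spin=None, t_cmb=None, velocity_term=1.0):
    """Amplitude (mK) of HI brightness-temperature fluctuations from Equation 1.

    If t_spin is None the spin-temperature fraction is taken as saturated (= 1),
    i.e. T_S >> T_CMB. velocity_term stands in for the redshift-space-distortion
    fraction and is set to 1 here (an assumption of this sketch).
    """
    if t_spin is None:
        spin_term = 1.0
    else:
        spin_term = (t_spin - t_cmb) / t_spin
    return 9.0 * delta * x_hi * math.sqrt(1.0 + z) * spin_term * velocity_term

# delta ~ 1 fluctuations just after reionization, with x_HI ~ 0.02.
post_eor = delta_tb_mk(delta=1.0, x_hi=0.02, z=6.0)
print(f"{1000.0 * post_eor:.0f} microK")
```

With these inputs the result is just under 500 μK, matching the scaling quoted in the text. Passing a spin temperature below the CMB temperature makes the spin-temperature fraction negative, which corresponds to seeing the HI in absorption.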

EoR experiments are very challenging as they must 
understand their error budgets sufficiently to assign any 
'excess' (above thermal noise) variance in their data to 
intrinsic HI fluctuations rather than, e.g., RFI or residuals 
from ionospheric calibration and foreground removal. As 
has proven to be the case in attempts to make statistical 
detections of HI at more modest redshifts, e.g. z ~ 0.8 
(Chang et al. 2010), convincing 'auto-correlation 
detections' (footnote 7) of HI variance may be much harder to obtain 



Footnote 7: In an (RA, DEC, z) data cube from any radio telescope 
with well understood sources of (non-astrophysical) variance, 
the normal ergodic assumption means that variance calculated 
statistically across spaxels is equivalent to an 'auto-correlation' 
<ρ × ρ> where ρ represents the signal in each spaxel. A 
cross-correlation signal, e.g. <ρ_1 × ρ_2> from two independent data 
cubes, e.g. a radio HI cube with a signal (in each spaxel) ρ_1 and 
an optical galaxy density cube with signal (in each spaxel) ρ_2, 
is much less prone to systematics as sources of variance (other 
than astrophysical signal) should be independent and average 
to zero in the cross-correlation. Note also that higher-order 



than 'cross-correlation' detections where the uncorrelated 
errors in two ways of probing the EoR may be exploited to 
obtain robust measurements of the astrophysical signal (footnote 8). 
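
The advantage of cross-correlation over auto-correlation can be illustrated with a toy simulation. The Python sketch below is not drawn from any SKA pipeline: the number of spaxels and the signal and noise levels are hypothetical, and both 'cubes' are flattened to simple lists that share one Gaussian signal but carry independent Gaussian noise.

```python
import random

random.seed(1)                      # fixed seed so the sketch is repeatable
n_spaxels = 20000
signal_rms, noise_rms = 1.0, 3.0    # hypothetical values; noise dominates

signal = [random.gauss(0.0, signal_rms) for _ in range(n_spaxels)]
# Two cubes see the same astrophysical signal but independent noise.
cube_a = [s + random.gauss(0.0, noise_rms) for s in signal]
cube_b = [s + random.gauss(0.0, noise_rms) for s in signal]

def correlate(p1, p2):
    """Mean of p1 x p2 over spaxels (inputs are zero-mean by construction)."""
    return sum(a * b for a, b in zip(p1, p2)) / len(p1)

auto = correlate(cube_a, cube_a)    # signal variance plus noise variance
cross = correlate(cube_a, cube_b)   # noise terms average towards zero
print(f"auto = {auto:.2f}, cross = {cross:.2f}, "
      f"true signal variance = {signal_rms ** 2:.2f}")
```

The auto-correlation is dominated by the noise variance, so recovering the signal from it requires the noise level to be known very accurately; the cross-correlation converges on the signal variance without that knowledge, at the price of needing a second, independent tracer.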

It should also be noted that HI power-spectrum meth- 
ods are not yet proven on astrophysical datasets, and be- 
ing inherently statistical in nature, will be challenging to 
establish. Thresholded surveys allow the use of the data 
itself to check for systematics or, through measured red- 
shift space distortions, to cleanly marginalise over galaxy 
bias (the scale factor between fluctuations in HI and 
fluctuations in dark matter; see also Abdalla et al. 2010). 

3. The phased construction of the SKA and its 
emerging role in cosmology 

The affordability of building and operating the SKA is 
strongly dependent on the solutions adopted for signal 
and data transfer and processing, and strongly justifies 
the decision to build the SKA in phases. This argument is 

statistics, such as skewness, may be helpful in extracting the true HI 
signal (Harker et al. 2009). 

Footnote 8: There are astrophysical signals beyond the Milky Way that 
must be removed (e.g. Jelic et al. 2008), and cross-correlation 
methods may be useful in separating these from the HI 
signal: the radio background due to 'foreground' radio sources is 
more than three orders of magnitude brighter than the HI 
signal (Bridle 1967), and not smooth on the sky or in redshift due 
to clustering; this source of variance will be strongly correlated 
with the location of galaxies in deep near-infrared images. 
particularly clear for HI cosmology experiments that 
involve both the aperture array (AA) and dish parts of the 
SKA. Other arguments regarding the phased construction 
of SKA dishes come from the requirements of pulsar 
astronomy (Kramer 2011). For convenience we use the term 
SKA0 here to refer to scientific and technical verification 
results imminent from the SKA pathfinder programs [see 
Rawlings et al. (2011) for a discussion of some of the key 
pathfinder programs, and for descriptions of the SKA1 and 
SKA2 realizations assumed herein]. 

The necessary evolution in AAs from SKA0 to SKA1 
is best appreciated by studying an aerial photograph of 
the LOFAR core (footnote 9). This shows a 'superterp' of diameter 
D_core = 0.34 km, but, at 130 MHz, it is obvious from the 
small density of high-band antenna stations that the 
array has a low filling factor (~ 0.02 out to r = 0.5 km), 
showing that pathfinder projects are approaching, but are still 
some way off, the scale of the SKA1 AA core. Critically, 
however, LOFAR is already being used for both 
astronomy and technology verification. The SKA1 core will 
improve on SKA0 in a number of ways: (i) it will be 
situated in a radio-quiet zone in Australia or South Africa, 
minimising the need for RFI flagging of the data [which 
can greatly complicate statistical measurement of the HI 
power spectrum as it is a source of data variance along 
the frequency, or redshift, axis] and minimising the 
number of bits required in the early digitisation [a 
construction and operating cost driver for the SKA]; (ii) it will, 
with D_core = 1 km, have ~ 10 times the physical area of 
the superterp; (iii) it will have a much higher filling 
factor (~ 0.8, c.f. ~ 0.02); and (iv) its methods of 
beam forming and correlating signals may be entirely in 
the digital domain, allowing greater control of systematic 
errors and calibration. These points mean that in terms 
of temperature sensitivity at 130 MHz (HI at z = 10) 
and 9-arcmin (25 Mpc comoving) resolution, SKA1 will 
outperform SKA0 by a factor ~ 100, making it the first 
instrument capable of high-fidelity imaging of both the 
intrinsic EoR fluctuations and the foregrounds that must be 
accurately removed to reveal them. 

3.1. The Dark Ages and the EoR 

The information in the radio sky arrives on Earth as waves 
of wavelength λ from all directions above the local 
horizon. With a future detection of HI signal from the EoR 
as the astronomers' first step into the dark ages, SKA0 
interferometer experiments are already interrogating this 
information to seek statistical detections of HI fluctuations 
in the redshift range 6 ≲ z ≲ 13 (footnote 10). Fig. 1 illustrates that the 
'sweet-spot' for detection and imaging the EoR must be 



Footnote 9: http://www.lofar.org/about-lofar/image-gallery/latest-lofar- 

Footnote 10: Other experiments, e.g. Bowman & Rogers (2010), are 
using much smaller numbers of receivers targeted at 
absolute, rather than differential, detection of HI, in the form of 
a monopole step-change in sky brightness with redshift due to 
reionization. They also target the z = 13 - 6 range as this 



somewhere near the centre of this redshift range: at z ~ 6 
the IGM is known (from optical absorption features in 
quasars) to be almost entirely ionized (Ouchi 2011), and 
by z ~ 13 (100 MHz) the sky temperature rises are 
starting to strongly degrade the signal-to-noise ratio of any 
proposed experiment. Also clear from Fig. 1 is that an 
angular scale of ~ 9 arcmin (25 Mpc, comoving, at 
z ~ 10) is optimum as, for a fixed number of receiver 
chains, there is a rapid loss of temperature sensitivity 
towards smaller angular scales, set against a gentler 
drop in HI signal level towards larger angular scales. 

To image the sky to detect HI at z ~ 10 (HI redshifted 
to λ = 2.3 m or 130 MHz) with high fidelity, it is necessary 
to space (many of) the AA elements by ≲ λ/2 = 1.15 m 
so as to form a dense aperture array that Nyquist samples 
the incident wavefront and hence avoids 'grating lobes' 
above the horizon. The central (D_core = 1-km diameter) 
core of the AA part of the SKA can then be considered 
as a single large (smart) aperture that has a natural FOV 
(or 'core beam') of angular width ≈ 1.22λ/D ≈ 9 arcmin 
that, by suitable addition of phase-weighted signals from 
the antennae, can be one of many beams [these can 
increase FOV and, potentially, adaptively null sources of 
RFI], or by suitable cross-correlation of antenna signals 
can continuously map out bright sources, and hence 
monitor ionospheric conditions across the sky. 
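
The numbers in this paragraph follow from the rest frequency of the 21-cm line and the diffraction formula. The Python sketch below reproduces them; the function names are ours, and the diffraction-limited width 1.22λ/D is applied to the core as if it were a single filled circular aperture, as the text suggests.

```python
import math

HI_REST_MHZ = 1420.40575   # rest frequency of the HI 21-cm line
C_M_PER_S = 299792458.0    # speed of light

def hi_frequency_mhz(z):
    """Observed frequency (MHz) of HI emitted at redshift z."""
    return HI_REST_MHZ / (1.0 + z)

def wavelength_m(freq_mhz):
    """Wavelength (m) corresponding to a frequency in MHz."""
    return C_M_PER_S / (freq_mhz * 1.0e6)

def beam_arcmin(lam_m, diameter_m):
    """Diffraction-limited beam width 1.22*lambda/D, converted to arcmin."""
    return math.degrees(1.22 * lam_m / diameter_m) * 60.0

lam = wavelength_m(130.0)
print(f"HI at z = 10 is observed at {hi_frequency_mhz(10.0):.0f} MHz")
print(f"lambda = {lam:.2f} m, element spacing lambda/2 = {lam / 2:.2f} m")
print(f"core beam for D = 1 km: {beam_arcmin(lam, 1000.0):.1f} arcmin")
```

The core beam comes out slightly above the ~ 9 arcmin quoted in the text, which is well within the precision of the rounded inputs.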

The SKA0 AA experiments are competing to obtain 
the first statistical detection of the HI EoR fluctuations. 
These all aim to achieve this by averaging together all the 
independent Fourier modes of the HI distribution that 
they can measure in the plane of the sky and in the 
frequency, or redshift, direction, to measure the 
power spectrum (scale dependence) of the HI fluctuations. In 
principle, the gains of a power spectrum approach are 
huge (see Table 1). Considering, say, 25-Mpc (9-arcmin) 
modes over a survey of 20 deg^2 (which is comfortably 
within the FOV of the LOFAR and MWA high-band 
analogue beam formers) gives, transverse to the line of 
sight, √N_modes ~ √900 = 30, which, averaging also over a 
cube (in comoving coordinates) in the redshift direction, 
gives a total mode-averaging boost in signal-to-noise 
ratio of ~ 165. This gain can grow further with additional 
redshift binning or sky coverage, explaining why 
statistical detections remain highly plausible (e.g. Zaroubi 2011) 
despite the low temperature sensitivities (filling factors ~ 0.01 over 
D_core = 1 km) of SKA0 experiments (Fig. 1). 
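
The mode-averaging boost quoted above can be checked with a few lines of Python. This is a sketch under one stated assumption: the survey volume is treated as a cube in comoving coordinates, so that the number of modes in three dimensions is the transverse number raised to the power 3/2. The function name is ours.

```python
import math

def mode_boost(area_deg2, scale_arcmin):
    """Return (transverse, total) signal-to-noise boosts from mode averaging."""
    scale_deg = scale_arcmin / 60.0
    n_transverse = area_deg2 / scale_deg ** 2   # independent patches on the sky
    n_cube = n_transverse ** 1.5                # same sampling along the redshift axis
    return math.sqrt(n_transverse), math.sqrt(n_cube)

transverse, total = mode_boost(area_deg2=20.0, scale_arcmin=9.0)
print(f"transverse boost ~ {transverse:.0f}, total boost ~ {total:.0f}")
```

Because the boost grows as the 3/4 power of the sky area under this assumption, doubling the area to 40 deg^2 would raise the total boost by a factor of about 1.7.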

There is also the possibility of exploiting 
cross-correlation techniques in the EoR. A stacking analysis (footnote 11) 
could be based on Lyα-emitting galaxies (e.g. Ouchi 2011): 

comfortably brackets uncertainties from CMB studies 
regarding when the Universe transitioned from x_HI ~ 1 to x_HI ~ 0.02. 

Footnote 11: A stacking analysis - cross-correlation of the radio data 
with a set of 3D Dirac delta functions centred on the (RA, 
DEC, z) coordinates of objects from other surveys - is the 
simplest variant of such methods. These methods will be 
particularly challenging in the case of Lyα surveys because resonant 
HI absorption strongly affects the line profiles and centroids, 
and requires a sophisticated stacking technique. 
an advantage here is that the advent of wide-field 
narrow-band near-IR capabilities with telescopes such as VISTA 
should deliver 1000s of z ~ 7 objects across EoR fields 
over the coming years. The disadvantages of this approach 
include: a limited likely redshift span of large samples of 
objects due to the increased rarity and faintness of 
suitably bright Lyα emitters at higher z; and likely 
redshift-dependent switches in the sign of the expected 
cross-correlation signal. For example, at z ~ 7, the positive 
correlation expected because HI emission is concentrated in 
galaxy haloes, reverses to a negative correlation at z ~ 10 
when ionized bubbles begin to percolate the Universe, and 
the HI signal may peak away from the sources of Lyα 
photons and heat. We emphasise that achieving an angular 
resolution of at least the ~ 1-arcsec level of seeing-limited 
optical data is critical because only then can sign-switches 
in cross-correlation analyses be used as useful information 
rather than as a source of confusing variance. 
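
A stacking analysis of the kind discussed here amounts to averaging the radio cube at the catalogued (RA, DEC, z) positions. The Python sketch below is a toy version only: the cube size, the number of objects, and the signal and noise levels are hypothetical, and it ignores the Lyα line-profile complications and the possible sign switches described in the text.

```python
import random

random.seed(7)
n_side, n_objects = 40, 400
signal, noise_rms = 0.5, 1.0        # hypothetical: each object is a 0.5-sigma signal

# Noise-only cube indexed as cube[ra][dec][z].
cube = [[[random.gauss(0.0, noise_rms) for _ in range(n_side)]
         for _ in range(n_side)] for _ in range(n_side)]

# Hypothetical catalogue of distinct voxels; add the faint signal at each one.
voxels = [(i, j, k) for i in range(n_side)
          for j in range(n_side) for k in range(n_side)]
catalogue = random.sample(voxels, n_objects)
for i, j, k in catalogue:
    cube[i][j][k] += signal

# Stacking: correlate the cube with delta functions at the catalogue positions,
# i.e. average the cube values found at those positions.
stack = sum(cube[i][j][k] for i, j, k in catalogue) / n_objects
stack_noise = noise_rms / n_objects ** 0.5
print(f"stacked signal = {stack:.2f} +/- {stack_noise:.2f}")
```

No individual object is detectable here, but the stack recovers the mean signal with a noise that falls as the square root of the number of objects. If the sign of the signal differed between sub-samples, as may happen across the EoR, the contributions would partly cancel, which is why the text stresses resolving such sign switches.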

Another potential cross-correlation method in the EoR 
is between HI and gas-rich galaxies detected by radio 
telescopes at high frequency: e.g. through the molecular 
CO (1-0) line with SKA0. This is discussed by Heywood 
et al. (2011a) but we emphasise here that the dish part 
of SKA1 may be needed to get 1000s of tracers across the 
EoR fields with a broad range of redshifts where 
galaxies are present and detectable; this requires SKA1 dishes 
to have good efficiency up to at least 15 GHz [CO(1-0) at 
z ~ 6.5]. Again, the HI data cubes will need ~ 1-arcsec 
resolution for efficient cross-correlation experiments. There 
are plans to construct new instruments aimed at adding 
'CO intensity mapping' (e.g. Gong et al. 2011; Lidz et 
al. 2011) to the armoury of techniques for studying the 
EoR. We also note plans to cross-correlate HI signals with 
CMB data (e.g. Tashiro et al. 2010) that would reveal 
effects such as an expected correlation between regions of 
low HI column density, and hence high ionized density, 
with large CMB optical depth. 

The various SKA0 low-frequency-AA experiments are 
also comparing and contrasting various technical solutions 
relevant to the working of a 'large-but-smart' highly-filled 
aperture, but these are important implementation details 
rather than technical show-stoppers holding up plans to 
build an SKA1 core. SKA1 will move the problem into a 
new regime where the HI signal targeted has fluctuations 
that on the relevant angular scales exceed the fluctuations 
due to thermal noise. The scientific and technical 
thinking behind the necessary leap in sensitivity required to go 
from cross-correlation, or tentative auto-correlation, 
detections of the EoR (all that is possible with SKA0) to an 
imaging instrument is illustrated in Fig. 1. The key point 
is that, in feasibly long exposures with SKA1, astronomers 
will reach the regime where the signal-to-noise ratios of both 
the HI fluctuations and the foregrounds are much greater than 
unity, meaning that direct mapping of EoR HI features 
becomes much more tractable in the presence of polarized 
foregrounds, ionospheric effects and RFI. If SKA0 does 
not yield any definitive auto-correlation detections of HI 
in the EoR, constructing SKA1 will be the only way 
forward in this research area; if SKA0 does detect HI EoR 
statistically, via auto- or cross-correlation techniques, the 
lessons of CMB astronomy tell us that instruments 
capable of directly imaging the HI fluctuations will be urgently 
needed and rapidly exploited. 

The scaling up from the AAs in SKA0 crudely follows 
'Moore's law' in that from 2008 (the start of the 
construction of the LOFAR core) to the start of construction of 
the SKA1 core in 2016 - i.e. roughly 7.5 years, or 5 lots 
of 18 months - the affordable number of antennae should 
be able to grow by a factor ~ 2^5 = 32, which 
comfortably exceeds the increase in core antennae: from ~ 18,400 
LOFAR high-band dipoles to ~ 280,000 SKA1 dipoles. 
There will be increased central concentration of the 
collecting area, bringing with it other major cost savings on 
data transport and infrastructure that should allow an 
all-digital solution (retiring the need for analogue 
beam-forming that feeds the outputs of 16 antennae in LOFAR 
and MWA high-band tiles to a single receiver chain). The 
SKA project has developed sophisticated cost-estimation 
tools (Bolton et al. 2009) that lend great confidence that 
such an AA system can be built and operated within the 
SKA1 budget restrictions. 
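
The 'Moore's law' comparison above is simple arithmetic, sketched here in Python with the dates and dipole counts quoted in the text. The 18-month doubling time is the text's own assumption, and the end date of 2015.5 simply encodes its rounding of the interval to 7.5 years; the function name is ours.

```python
def doublings(start_year, end_year, doubling_time_years=1.5):
    """Number of doubling periods between two dates."""
    return (end_year - start_year) / doubling_time_years

# 7.5 years at 18 months per doubling gives 5 doublings.
growth_allowed = 2 ** round(doublings(2008.0, 2015.5))
growth_needed = 280000 / 18400     # SKA1 dipoles over LOFAR high-band dipoles
print(f"allowed growth ~ {growth_allowed}, needed growth ~ {growth_needed:.1f}")
```

The required increase in dipole number is roughly half of what the scaling would allow, which is the margin the text describes as comfortable.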

Fig. 1 shows that SKA1 will have sufficient 
temperature sensitivity in a 1000-hr exposure to image 
fluctuations. The most efficient and reliable way of covering each 
of a few, say five, 20-deg^2 patches is to build a hierarchical, 
but preferably still all-digital, beam-forming and 
correlation solution: following Dewdney et al. (2010), with 
180-m diameter stations, each with a 50-arcmin station beam, 
this requires only N_station ~ 30 to ensure continuous sky 
coverage with station-beam overlap, plus outrigger beams for 
calibration and RFI excision. With 25 of 50 AA stations 
in the core (radius r < 0.5 km), the core filling factor 
of ~ 0.8 (c.f. ~ 0.01 for SKA0) provides excellent 
temperature sensitivity on large angular scales, while 10 
further stations in the 'inner' region (0.5 < r < 2.5 km) 
will give sensitive imaging capability at resolutions down 
to ~ 2 arcmin (see Fig. 1); 15 further stations in 3 spiral 
arms out to r = 100 km provide the resolution essential 
for the removal of point and extended foreground sources. 
Table 1 shows that SKA1 will produce a 
cosmic-variance-limited measurement of the power spectrum at z ≈ 10 
that is comparable in accuracy to measurements of P(k) 
from SDSS in the local Universe. SKA1 will be able to 
map out the evolution of the HI signal over the full range 
6 ≲ z ≲ 13: probing rapid changes in x_HI, as well as other 
factors in Equation 1, with cosmic time will mean that 
astronomers will learn huge amounts about the processes 
happening during this key epoch. However, despite 
significantly enhancing knowledge of the reionization process 
(and its effect on CMB data), this experiment is unlikely 
to produce game-changing measures of cosmological 
parameters. 

An EoR-optimised AA in SKA1 will also be used to 
attempt statistical detections of HI at z > 13, but the 
T_sys ~ T_sky ∝ λ^2.6 scaling at low frequency makes this a 
challenge. Putting aside the challenges of ionospheric 
calibration and foreground removal, that also worsen 
considerably with increasing λ, imaging looks challenging at 
z > 13 (Fig. 1). However, simulations (Santos et al. 2011) 
predict that the fluctuation signal may be boosted from 
the 10 mK EoR level (Fig. 1) to, perhaps, the 100 
mK level, meaning it might be possible to measure the HI 
power spectrum in the Dark Ages. Adopting the 
conservative prediction for the z = 18 signal in Fig. 1 and 
realising that it would only take a few station beams to map, 
say, 20 deg^2, then by mode averaging, it should be 
possible to attempt significant statistical detections across a 
few independent sky patches. These observations are 
critical for understanding how the HI signal reflects the first 
stars and black holes ahead of the EoR: measuring, and 
potentially even mapping, the predicted ~ 100 mK level 
fluctuations would confirm the underlying conjecture 
of strong, highly spatially clustered HI absorption due to 
cold (T_S ≪ T_CMB; see Equation 1) absorbing material 
near to the first galaxies (Barkana & Loeb 2004). 

In EoR and Dark Ages studies, the chief SKA2 
science drivers would then be to move beyond the limited 
resolutions, sky areas and redshift ranges observable with 
SKA1: higher resolution is needed to map ionized 
structures directly associated with quasars and star-forming 
galaxies; covering a large fraction of the sky is essential 
for power-spectrum sensitivity and cross-correlation with 
CMB; and 13 < z < 20 - 30 (100 MHz down to ~ 70 - 45 
MHz) provides new information on the Universe. These 
can all be achieved simultaneously by, in SKA2, building 
out the high-filling-factor AA from the core (r < 0.5 km) 
into the inner region (0.5 < r < 2.5 km), so that with long 
(1000 hr) exposures, there is sufficient sensitivity on 
few-arcmin scales to image ~ 1 mK fluctuations in the EoR, 
and work, at least statistically, in the Dark Ages. 

This should delineate various stages expected in the 
Dark Ages: strong absorption associated with the first 
stars; hot IGM, and hence HI emission (T_S ≫ T_CMB; see 
Equation 1), near the first black holes, and cold absorbing 
IGM surrounding these regions (Pritchard & Furlanetto 
2007); and clear views of the subsequent process of 
reionization. For EoR fluctuations, SKA2 will have a mapping 
speed gain of ~ 25 from sensitivity and ~ 4 - 10 from 
extra beams (and hence FOV) sufficient to obtain 
cosmic-variance-limited imaging over a significant fraction of the 
sky. Table 1 shows the low error bars on P(k) that will 
result. It is worth noting that, because of the large sky 
temperatures at these low frequencies, the SKA2 core 
may need to have a collecting area approaching ~ 10 km^2. 

The removal of polarized Galactic foreground from 
EoR and Dark Age observations will need broad 
wavelength coverage to account for complex angular-scale 
dependent effects such as Faraday rotation and 
depolarization, and, together with the demands of pulsar surveys 
(Kramer 2011), underpins the required SKA1 (~ 70 - 450 
MHz) frequency range. It is therefore inevitable that 
SKA1 will address many of the goals of the SKA 
magnetism key science project (Gaensler et al. 2004). 



3.2. The post-EoR Universe 

In the post-EoR Universe, the dish-based arrays combined 
here under the term SKA0 will make great strides towards 
understanding how the HI in galaxies traces the 
underlying dark matter: e.g. ASKAP and WSRT/APERTIF are 
likely to generate all-sky HI surveys approaching in size 
the ~ 10^6-galaxy surveys that current optical surveys like 
SDSS have delivered; 'deep' (to z ≈ 0.2) surveys across 
regions heavily studied by optical redshift and near-IR 
imaging surveys will allow sophisticated cross-correlation 
analyses. Such efforts will be complemented by deep HI 
stacking and absorption-line experiments with MeerKAT, 
so the nature of HI evolution between 0 < z < 1 may be 
moderately well understood ahead of SKA1 operation. 

A first goal of the dish part of SKA1 will be a 
(largely thresholded) survey of HI galaxies between z = 
0.2 and z < 2, requiring 1000 hours of exposure. Assuming 
FOV extension from Phased-Array Feeds (PAFs) from 
the SKA Advanced Instrumentation Program (AIP) by 
a factor ~ 10 over that of a single-pixel feed at 700 MHz 
(z = 1 HI), then 40 deg^2 of sky coverage (independent 
of frequency) per patch is plausible, yielding 200 deg^2 in 
five independent sky patches. Whilst not competitive with 
BOSS in terms of volume (see Table 1), the overlap regions 
between SKA1 and BOSS would establish any limitations 
imposed on either method due to systematic errors. 

Another main science driver for SKA1 will be to push 
this type of detailed radio-optical cross-correlation work 
to the z ~ 2-3 regime. Here, the next-generation optical 
redshift surveys (e.g. HETDEX, Hill et al.) will be 
attempting to assemble the first large (~ 10^6-galaxy-sized) 
samples. These are critical epochs because they probe the 
era before dark energy is thought to make a major contribution 
to Universal expansion, so that other critical cosmological 
parameters, such as the Universe's intrinsic curvature, 
can be constrained independently. In the equatorial overlap 
between (the mostly Northern Hemisphere) HETDEX 
and an AA survey with SKA1, HETDEX expects to discover 
≲ 2 × 10^5 objects in the redshift range 2 < z < 3 over 
100 deg^2 near the celestial equator. A 1000-hr exposure 
with SKA1 achieves, at 450 MHz (z = 2.1) using 
AAs, ~ 1σ detections of galaxies near the break of the HI 
mass function. A sky area of ~ 100 deg^2 could plausibly 
be covered instantaneously by AAs (if N_beam = 1500). 
With, conservatively, 10^5 Lyα objects, and stacking 
techniques, the signal-to-noise ratio for HI at z ~ 2 would be 
~ 300, yielding the ability to split the sample by galaxy 
type (e.g. using near-IR estimates of stellar, and hence, 
dark matter mass). 
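
The stacking estimate above follows from the usual square-root scaling of signal-to-noise with the number of co-added objects. The sketch below assumes independent Gaussian noise, an individual detection level of 1σ per galaxy, and a stack of 10^5 objects; the last value is an assumption adopted here because it reproduces the quoted signal-to-noise ratio of ~ 300.

```python
import math


def stacked_snr(per_object_snr, n_objects):
    """S/N of a stack of n_objects, assuming independent Gaussian noise."""
    return per_object_snr * math.sqrt(n_objects)


# Assumed inputs: ~1-sigma individual detections, 1e5 stacked positions.
full_stack = stacked_snr(1.0, 1e5)

# Splitting the sample, e.g. into 10 equal bins by galaxy type,
# leaves each bin with 1e4 objects.
per_bin = stacked_snr(1.0, 1e4)

print(f"full stack S/N ~ {full_stack:.0f}, per-bin S/N ~ {per_bin:.0f}")
```

Under these assumptions a ten-way split still leaves a signal-to-noise ratio of order 100 per bin, which is why dividing the sample by galaxy type remains feasible.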



HETDEX does not have continuous sky coverage, so a 
truly combined survey is plausible as the ~ 15-arcmin size 
of the HETDEX 'tiles' is well matched to the low-frequency 
station-beam size of SKA1 at ~ 450 MHz. A design consideration 
for all AAs in SKA is that they are able to observe the 
celestial equator due to the concentration of other-wavelength 
surveys (such as HETDEX) and capabilities there. 

A direct measurement of the evolving HI galaxy power 
spectrum over at least some of the z ~ 2-6 range is also 
plausible and critical to pursue because Table 1 shows 
that there are many modes available for HI intensity mapping 
with SKA1. Over this range of redshifts, the temperature 
of the fluctuations on a fixed comoving scale is 
expected to be roughly constant. The 'bias' b is inferred 
from Equation 1 and should increase from ~ 200 μK at 
z ~ 2 to ~ 500 μK at z ~ 6. So, on a fixed comoving scale 
(say 50 Mpc), this compensates largely for the corresponding 
factor ~ 3 decline in expected signal due to the growth 
factor [at z ~ 2, the temperature fluctuations on ~ 50 Mpc scales 
will be ~ 200 μK × σ8/g ≈ 70 μK (taking σ8 = 0.8 to be 
the effective normalization of the power spectrum on the 
relevant scale)]. Note, again, the observational confirmation 
of this rough scaling at z ~ 0.8 (Chang et al. 2010). 
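
The rough constancy of the fluctuation temperature can be checked with a few lines of arithmetic. This sketch reads the expression in the text as b × σ8/g, with σ8 = 0.8 and the growth factor taken as g ≈ 0.8(1 + z); that form of g is an assumption based on the footnoted definition, and the bias values are those quoted above.

```python
SIGMA_8 = 0.8  # effective power-spectrum normalisation adopted in the text


def growth_factor(z):
    """Growth factor g as read from the footnote: g ~ 0.8 (1 + z) (assumed)."""
    return 0.8 * (1.0 + z)


def fluctuation_temperature_uk(bias_uk, z):
    """Temperature fluctuation (micro-K) on ~50 Mpc scales: b * sigma_8 / g."""
    return bias_uk * SIGMA_8 / growth_factor(z)


dt_z2 = fluctuation_temperature_uk(200.0, 2.0)  # bias ~ 200 micro-K at z ~ 2
dt_z6 = fluctuation_temperature_uk(500.0, 6.0)  # bias ~ 500 micro-K at z ~ 6

print(f"z = 2: {dt_z2:.0f} micro-K, z = 6: {dt_z6:.0f} micro-K")
```

Both redshifts give a value close to 70 μK, which is the sense in which the rising bias offsets the declining growth.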

Assuming good progress from the SKA AIP, a relatively 
small (say 15-station) SKA2-capable mid-frequency 
AA could be available near the time of SKA1 first 
operation. With stations of diameter 56 m embedded in the 
SKA1 configuration, and with antennae densely packed 
below ~ 500-800 MHz (Schilizzi et al. 2007), these would 
overlap in frequency with the SKA1 low-frequency AAs 
over the redshift range 2.1 < z ≲ 3.5 (allowing foreground 
removal). Distributed over a core of diameter D_core = 300 
m (as the start of the SKA2 mid-frequency AA core), this 
would provide a core beam equivalent to ~ 25 comoving 
Mpc (sufficient to Nyquist sample the first three BAO 
'wiggles'; see also Chang et al. 2008) and a filling factor 
f ~ 0.5. This would provide an r.m.s. temperature sensitivity 
in a 1000-hr exposure (assuming a redshift depth of ~ 25 
comoving Mpc, or ≈ 2.3 MHz, and a system temperature 
T_sys = 50 K) of ≈ 20 μK, meaning the thermal map noise will 
be ~ 3.5 times lower than the signals from HI structures, 
allowing mapping of these structures over, say, 20 deg^2 
[N_station ≲ 30]. Over 3.5 ≲ z ≲ 6.0, the data quality will be 
compromised by the sparse nature of the (low-frequency) 
AA antennae at the relevant frequencies: detections via 
auto-correlation techniques might prove challenging, but 
are certainly not implausible. 
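
The noise estimate above can be roughly reproduced from the quoted station count, core size, bandwidth and system temperature. The text does not state which form of the radiometer equation was used; the form below, T_sys / (f sqrt(n_pol Δν t)) with two polarisations and a geometric filling factor, is an assumption made for this sketch.

```python
import math


def filling_factor(n_stations, station_diam_m, core_diam_m):
    """Fraction of the core area occupied by stations (ratio of areas)."""
    return n_stations * station_diam_m**2 / core_diam_m**2


def map_noise_uk(t_sys_k, fill, bandwidth_hz, t_int_s, n_pol=2):
    """Brightness-temperature r.m.s. (micro-K) from an assumed radiometer form."""
    return 1e6 * t_sys_k / (fill * math.sqrt(n_pol * bandwidth_hz * t_int_s))


# Values quoted in the text: 15 stations of 56 m in a 300-m core,
# T_sys = 50 K, 2.3 MHz of bandwidth, 1000 hours of integration.
fill = filling_factor(15, 56.0, 300.0)
noise = map_noise_uk(50.0, fill, 2.3e6, 1000 * 3600.0)

print(f"filling factor ~ {fill:.2f}, map noise ~ {noise:.0f} micro-K")
```

The filling factor comes out close to 0.5 and the noise at a few tens of μK, the same order as the ≈ 20 μK quoted; the exact figure depends on the assumed form and on efficiency factors not given in the text.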

These considerations show that, in going from SKA1 
to SKA2, it will be critical to use the results of the AIP to 
optimally enhance the mapping speed of the z < 2 (> 470 
MHz) Universe with the SKA. The gain achieved will be 
by a factor ~ 100-10000, depending on the adopted AIP 
technology. Crudely, if the eventual SKA2 realization 
expands only the existing (70-450 MHz) SKA AA element 
(i.e. no mid-frequency AAs are built), a factor ~ 10 
increase in the number of single-pixel-feed dishes would 
enhance the SKA mapping speed by 'just' 100, meaning a 
20,000 deg^2 5σ-thresholded (~ 10^9 objects to z < 2) survey 
would be unfeasible (Abdalla et al. 2010): the best that 
could be managed would be a thresholded ~ 100 deg^2 



The bias b is defined by Equation 1, with units of mK^2, as the ratio 
of the power spectrum of the temperature fluctuations to the 
product of the matter power spectrum and the square of the 
cosmic growth factor g ≈ 0.8(1 + z). 



survey to z ~ 2. This would be interesting as an adjunct, 
e.g. to the Euclid deep field, but would not constitute a 
game-changing experiment in cosmology, and would be 
uncompetitive with the wide-area near-IR redshift survey 
with Euclid. If the AIP selects an SKA2 realisation 
including 250 × 56-m diameter densely-packed AA 
stations (Rawlings & Schilizzi 2011), then the mapping 
speed gain would include a factor of ~ 100 from the 
sensitivity increase, and a further factor of ~ 100 from the FOV 
increase (for mid-frequency AAs over dishes), allowing 
'all sky' thresholded surveys (Table 1). 
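
The range of mapping-speed gains quoted above follows from the standard scaling of survey speed with the square of the sensitivity times the field of view. The sketch below applies that scaling to the two cases in the text; the function name is illustrative.

```python
def mapping_speed_gain(sensitivity_gain, fov_gain=1.0):
    """Survey-speed gain: (sensitivity gain)^2 x (field-of-view gain)."""
    return sensitivity_gain**2 * fov_gain


# Case 1: ten times more single-pixel-feed dishes, no FOV expansion.
dishes_only = mapping_speed_gain(10.0)

# Case 2: the same sensitivity gain plus a ~100x larger FOV from
# mid-frequency aperture arrays.
with_mid_aas = mapping_speed_gain(10.0, fov_gain=100.0)

print(dishes_only, with_mid_aas)
```

The two cases bracket the factor ~ 100-10000 quoted for the SKA1 to SKA2 step, with the FOV term supplying the difference between them.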

The resulting 'billion-galaxy' surveys are needed to 
address questions such as neutrino mass, which is measurable 
to the lowest limit allowed by particle physics experiments 
at SKA2 sensitivity (Abdalla & Rawlings 2007), 
and sub-per-cent accuracy on the dark energy w parameter 
or demonstration, via measuring the cosmic evolution 
of the growth factor g, of the need for post-Einstein gravity. 
All such experiments would be well serviced by SKA2 
galaxy power spectra (in several independent redshift bins, 
see Table 1) achieving high signal-to-noise ratio on BAOs 
and other features. They would also allow marginalization 
over galaxy bias through measurement of velocity-space 
distortions (Abdalla et al. 2010). 

Additionally, with foreground removal using both low- 
and mid-frequency AAs, there are prospects of applying 
the intensity mapping method with both cores to get 
a comparable measure of the 2 ≲ z ≲ 6 power spectrum 
(Table 1) - this could be critical for distinguishing between 
effects of dark energy (that have little local effect in the 
high-z Universe) and other effects such as small, but 
non-zero, intrinsic Universal curvature. Statistical studies of 
the density field beyond P(k), e.g. to high-order statistics 
or issues of primordial non-Gaussianity, can be applied to 
these large-number-of-mode surveys, and by testing competing 
models of inflation (Jeong & Komatsu 2009), will 
start to address the next set of key questions in cosmology. 



The total antenna count in this mid-frequency aperture array 
would be, assuming λ/2 sampling at 500 MHz (λ = 0.6 m), 
(250 × π × 56^2)/(4 × (0.6/2)^2) ~ 7 × 10^6, and ~ (800/500)^2 
higher if, as preferred for imaging quality, the 500-800 MHz 
regime is entirely in the densely-packed regime (Schilizzi et 
al. 2007). In broad-brush terms, these order-of-magnitude 
increases in antenna count (over SKA1) look plausible within the 
SKA2 funding envelope (Bolton et al. 2009). This is certainly 
a case where crude scaling arguments have the potential to be 
dangerous, and the development of increasingly sophisticated 
costing tools up to the 2017 decision point on SKA2 will 
become an increasingly crucial aspect of the project. 
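
The antenna-count arithmetic in this note can be reproduced directly. The sketch assumes circular stations tiled with square cells of side λ/2, which is how the expression in the text is written; the function name is illustrative.

```python
import math


def antenna_count(n_stations, station_diam_m, wavelength_m):
    """Elements needed to fill n_stations circular stations at lambda/2 spacing."""
    station_area = math.pi * station_diam_m**2 / 4.0
    cell_area = (wavelength_m / 2.0) ** 2
    return n_stations * station_area / cell_area


# 250 stations of 56-m diameter, dense at 500 MHz (lambda = 0.6 m).
n_dense_500 = antenna_count(250, 56.0, 0.6)

# Keeping the array dense up to 800 MHz scales the count by (800/500)^2.
n_dense_800 = n_dense_500 * (800.0 / 500.0) ** 2

print(f"{n_dense_500:.2e} elements at 500 MHz, {n_dense_800:.2e} at 800 MHz")
```

This gives just under 7 × 10^6 elements for dense sampling at 500 MHz, and about 2.6 times more if the dense regime extends to 800 MHz.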

The 'half-way house' of using PAFs to expand the FOV 
of all the 15-m dishes is another possibility, but at 450 MHz 
(λ = 0.67 m), a phased array feed (with 10-100 beams) would 
become, in size, a large fraction of the primary reflector of the 
dish, and the cost of data transport and correlation may prove 
prohibitive (see Sec. 3 for a potentially stronger cosmological 
science case for SKA2 PAFs). 

4. Concluding Remarks 

From Table [1] we see that SKA is unique in future as- 
tronomy in allowing wide-field access to the very distant 
(z ~ 6-30) Universe when the first stars, galaxies and 
black holes formed and the Universe was reionized. The 
phased roll-out of the SKA means that its first results can 
be available alongside the first operation of the ELTs (and 
well-established facilities like ALMA, JWST etc) that will 
be available for studying objects in the EoR. However, the 
ways in which the HI fluctuations trace P(k) (Equation 1) 
at very high redshift are probably too complicated to add 
much to questions like the mass of neutrinos, the dark- 
energy w parameter or post-Einstein gravity. 

With regard to allowing substantially improved cos- 
mological measurements over those already published, or 
those to be measured by BOSS and HETDEX, the most 
promising surveys appear to be those at z < 6 emphasised 
in bold text in Table [1] (see also Loeb & Wyithe 2008). 
Amongst planned experiments, only SKA, BigBOSS and 
Euclid have the ability to make cosmic-variance-limited 
measurements over the redshift range 0.2 < z < 2 where 
there are ~ 10^ modes available for study. The SKA sur- 
veys could have significantly higher effective volumes than 
Euclid or BigBOSS surveys (Kim et al. 2011), and will 
probe much of this volume in the nP(k) ≳ 1 regime 
(Abdalla et al. 2010). This means that analyses of SKA 
data will have the opportunity to test the stability of re- 
sults to how the tracers of the underlying density field are 
selected. It is of course highly plausible that the combi- 
nation of SKA, BigBOSS and Euclid datasets will prove 
much more powerful together than separately: the poten- 
tial of eliminating systematics through cross-correlation 
may prove key to achieving the best-possible reductions 
in the error budgets on the cosmological parameters. 

Here, we have focussed on HI cosmology as it is HI 
and pulsar astronomy that drive the design of SKA1. The 
SKA may, however, contribute to cosmology in other ways 
(e.g. Sutherland et al. 2011): 

Weak gravitational lensing. Although considered by 
Rawlings et al. (2004), this is generally not emphasised 
in the SKA science case due to the strong competition 
from optical facilities, either ground-based (e.g. LSST) 
or space-based (Euclid). It remains true that SKA has 
the potential to combine the large-sky-area coverage of 
LSST with the superb control of the point-spread function 
available only with space missions at optical wavebands 
- and even then challenging if these observations 
are obtained with only limited colour information; SKA 
can also help with tomographic weak-lensing experiments 
using redshifts from its HI redshift surveys. SKA1 will be a 
very useful weak lensing facility, and particularly so if the 
AIP yields a dish-FOV-extension technology opening up 
the possibility of ~ 20 deg^2 surveys at 1000-hr exposure 
depth at ~ 1-2 GHz frequencies where the PSF width is 
0.5 arcsec, as needed for weak-lensing experiments. In 
the spirit of the discussion of HI, it also remains plausible 
that the power of combining radio and optical experiments 
will be particularly crucial for pushing the determination 
of the cosmological parameters to the next level of 
accuracy. There are hints that this may be the case from 
the first attempts at joint radio-optical weak lensing that 
find stronger and cleaner signal in cross-correlation than 
in auto-correlation (Patel et al. 2010). 

Measurement of H0. SKA studies of distant water 
masers for H0 measurement were considered by Rawlings 
et al. (2004), with the caveat that the rest-frame frequency 
of water masers is 22 GHz, so large ranges of redshift will 
be unavailable unless the SKA dishes are operated at 
frequencies between 10-22 GHz - this is currently a goal, 
rather than a requirement, of the SKA design process. 
Strong-gravitational-lensing experiments (e.g. Koopmans 
et al. 2004) are also sensitive to H0 through differential 
time delays (given a lens model encoding details of dark 
matter in collapsed structures): this provides an exciting 
combination of constraints on dark energy and dark matter 
that is worth a serious re-examination in the light of 
plans for SKA1 (e.g. Heywood et al. 2011b). 